[RFC] Initial gRPC protocol for agent communication #2
Conversation
pkg/agent/api/grpc/hyperstart.proto
Outdated
message CreateContainerRequest {
    Container container = 1;
    Process init = 2;
pls replace the 2 fields above with container_id.
Don't we need the configuration of the process, or has it already been defined by the OCI spec?
pkg/agent/api/grpc/hyperstart.proto
Outdated
    string type = 1;
    uint64 hard = 2;
    uint64 soft = 3;
}
remove the above Container/Mount/Process/Rlimit.
rename User to StringUser.
StringUser is needed since the OCI spec doesn't allow string user ids.
pkg/agent/api/grpc/oci.proto
Outdated
string Version = 1;

// Process configures the container process.
OCIProcess Process = 2;
Remove the OCI prefixes for OCIProcess, OCIMount, OCIUser...
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 Signed-off-by: Wang Xu <gnawux@gmail.com>
updated based on the above comments
@gnawux You need to regenerate the …
}

message ExecProcessRequest {
    string container_id = 1;
add StringUser here
@laijs This would be redundant with the Process.User field, wouldn't it?
Quoting the comment by @laijs above:
"StringUser is needed since the OCI spec doesn't allow string user ids"
Please correct me if anything is wrong.
Sure, to make containers boot quickly, avoiding mounting the root block device on the host is very important.
@sboeuf That username is used for Windows only: opencontainers/runtime-spec@f9e48e0
It is a little weird if we reuse this variable. @gnawux do you accept it?
We can always mount the block device inside the container to access /etc/passwd.
But we avoid mounting the block device on the host (for accessing /etc/passwd and filling in the numerical user id in the spec):
- The host might not have the ability to mount it. For example, if the block device comes from Ceph and the host doesn't have krbd.ko. And even if the host has it, the userspace Ceph lib + QEMU is usually the better choice.
- Security: if the block device was once mounted inside the VM, the host should not mount it. Code in the VM might hack into the guest kernel and modify the metadata of the filesystem on the block device, and the host kernel might then be broken into when mounting that block device.
- Performance: to speed up the starting of the container, hyper avoids mounting the block device on the host.
@laijs not sure I follow everything here, because I thought that /etc/passwd was passed through 9pfs via the shared directory. And in case /etc/passwd comes from the block device passed through the VM, it is mounted somewhere, meaning that you can tell libcontainer that you expect /etc/passwd to be mounted, right?
@laijs just to be clear, I am not saying that I want to mount the block device on the host; I agree it would take some time. We don't do that either.
But the block device needs to be mounted inside the guest by the agent, and then we can provide libcontainer with the right mounts so that it will mount /etc/passwd in the right place. This way, it would be able to know what 1000:1000 means for the user/group.
You are right.
Runtime --> agent --> libcontainer.
The runtime has to pass the string user name to the agent if a string name is specified by upper-layer tools. The agent parses the string name from /etc/passwd and passes the numeric user id to libcontainer.
"Runtime --> agent" is the API, so we need the string user name in the API.
docker-runc doesn't allow the string user name, so docker/moby does the parsing of the user name. The kata runtime CLI will not allow the string user name either. But the kata runtime can also be a lib for hyperd/frakti, and in that case the string user name will be passed to the agent if requested.
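The agent-side step described above (parse /etc/passwd inside the guest, hand a numeric uid/gid to libcontainer) can be sketched in Go. Note this is only an illustration: `resolveUser` and its passwd-string input are hypothetical names, not the actual kata agent API.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// resolveUser parses passwd(5)-formatted data (as the agent might read from
// the guest's /etc/passwd) and returns the numeric uid/gid for a string
// user name. Hypothetical helper, for illustration only.
func resolveUser(passwd, name string) (uid, gid int, err error) {
	sc := bufio.NewScanner(strings.NewReader(passwd))
	for sc.Scan() {
		// passwd format: name:passwd:uid:gid:gecos:home:shell
		fields := strings.Split(sc.Text(), ":")
		if len(fields) < 4 || fields[0] != name {
			continue
		}
		if uid, err = strconv.Atoi(fields[2]); err != nil {
			return 0, 0, err
		}
		if gid, err = strconv.Atoi(fields[3]); err != nil {
			return 0, 0, err
		}
		return uid, gid, nil
	}
	return 0, 0, fmt.Errorf("user %q not found", name)
}

func main() {
	passwd := "root:x:0:0:root:/root:/bin/sh\nwww:x:1000:1000::/home/www:/bin/sh\n"
	uid, gid, err := resolveUser(passwd, "www")
	if err != nil {
		panic(err)
	}
	fmt.Println(uid, gid) // 1000 1000
}
```

The numeric ids would then be filled into the Process.User field handed to libcontainer, which is why the API itself still needs to carry the string name.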
rpc AddRoute(AddRouteRequest) returns (google.protobuf.Empty);
rpc OnlineCPUMem(OnlineCPUMemRequest) returns (google.protobuf.Empty);
}
message Storage {
    string driver = 1;      // empty in most cases; it will support "drbd", "bcache", ...
    string source = 2;      // "/dev/sdb", "/dev/disk/by-scsi/xxxx", "none", ...
    string fstype = 3;      // "xfs", "ext4" etc. for a block dev, "9p" for the shared fs, or "tmpfs" for a /dev/shm shared by all containers, ...
    repeated string options = 4;
    string mount_point = 5; // mount_point is only visible to the VM, not to containers. This mount point can be used in oci.Mount.Source as "/Storage/mount/point/{rootfs|_data}".
}
I think this Storage + oci.Mount can reassemble the current hyperstart storage model.
@laijs Would you mind describing use cases for this Storage payload?
@sameo In short, for backward compatibility with the current hyperd/hyperstart.
The Storage derives from the volumes of hyperd and runv, in which we can insert a block device (as the source; a loop device or Ceph rbd is fine as well) as a volume of a container (with filesystem fstype, at the mount_point).
According to the assumptions of the WIP CSI, block devices would be inserted as block devices instead of mounted filesystems, i.e., for most current container images a k8s init container is needed to mount the block devices before they can be accessed.
The Storage struct here can be thought of as a shortcut for that volume-init-container behavior, because block devices are much more common in a virtualized world. Then we don't need to run a volume init container before launching a container, especially for cases not working with k8s.
Any additional explanation, @laijs?
@gnawux explained it from a high-level view. Here I focus on the low-level view:
We need the initial storage config for mounting the 9p share, which can be configured by this Storage in StartSandboxRequest.
oci.Spec can't ask you to "mount /dev/sda1 to /kata/storage/xxxx and use /kata/storage/xxxx/rootfs as the container rootfs".
oci.Spec can't ask you to "mount /dev/sdd to /kata/storage/yyy and use /kata/storage/yyy/_data as one of the container volumes".
oci.Mount can't ask you to mount a tmpfs on /kata/storage/shm and bind it to all containers' /dev/shm.
And more ...
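To make the mount_point convention above concrete, here is a hedged Go sketch. The `Storage` struct mirrors the proposed message fields; `containerRootfs` is a hypothetical helper showing how the guest-side mount point maps to a path usable in oci.Mount.Source, not actual agent code.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Storage mirrors the proposed proto message (sketch, not generated code).
type Storage struct {
	Driver     string   // empty in most cases
	Source     string   // e.g. "/dev/sda1"
	Fstype     string   // e.g. "ext4", "9p", "tmpfs"
	Options    []string // mount options
	MountPoint string   // visible to the VM only, not to containers
}

// containerRootfs returns the path an oci.Spec could reference as the
// container rootfs once the Storage has been mounted inside the VM,
// following the "/storage/mount/point/rootfs" convention from the comment.
func containerRootfs(s Storage) string {
	return filepath.Join(s.MountPoint, "rootfs")
}

func main() {
	s := Storage{
		Source:     "/dev/sda1",
		Fstype:     "ext4",
		MountPoint: "/kata/storage/xxxx",
	}
	fmt.Println(containerRootfs(s)) // /kata/storage/xxxx/rootfs
}
```

The point is that the agent performs the Storage mount first, and only then can the oci.Spec's mounts and rootfs refer to paths beneath mount_point.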
@laijs I was about to ask for a shareDir parameter for StartSandbox, since we need a way to pass some files and rootfs through 9pfs. Your Storage structure looks good because it is a generic way to pass things to the VM through any type of filesystem that needs to be shared across all containers, and that is not actually hotpluggable.
Also, to give some more details here, we cannot completely rely on block devices to pass the rootfs to the VM. In the case of overlay, we don't have a block device, and I don't think we want to spend a large amount of time preparing a block device based on it; it would consume too much time. I am also not sure how changes to the overlay layers would be reflected on such a block device.
That's why having a mount point that can be shared when starting the VM would be the best option to support multiple cases.
@laijs we have renamed the driver field to format in runV, and the content should be stuff like raw, qcow2, dir, ... We don't mind whether it is a bcache device or a loop file, as long as it can be treated as a block device.
We use block devices for passing the rootfs to the VM in several cases. Host storage drivers such as devicemapper, (cow-)rawblock, and ceph (hyper.sh) fit well into runv/hyperstart. In these cases, block devices containing the container rootfs or a volume are hotplugged into the VM. In the near future, DAX is going to be supported, which also relies on block devices.
Only when the host storage driver is overlayfs/aufs/btrfs is 9pfs used for passing the rootfs.
So both ways (block & shared fs) are supported. @sboeuf
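The driver-based choice described above could be sketched as a small Go helper. Both `useSharedFS` and the exact set of driver names it matches are assumptions for illustration; the real runv/kata decision logic lives elsewhere.

```go
package main

import "fmt"

// useSharedFS sketches the host-side decision described above:
// overlayfs/aufs/btrfs graph drivers have no per-container block device to
// hotplug, so the rootfs goes through 9pfs; block-backed drivers
// (devicemapper, rawblock, ceph, ...) hotplug the device into the VM.
// Hypothetical helper and driver names, for illustration only.
func useSharedFS(graphDriver string) bool {
	switch graphDriver {
	case "overlayfs", "overlay", "overlay2", "aufs", "btrfs":
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(useSharedFS("overlay2"))     // true  -> pass rootfs via 9pfs
	fmt.Println(useSharedFS("devicemapper")) // false -> hotplug block device
}
```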
}

message CreateContainerRequest {
    string container_id = 1;
+ repeated Storage
+ StringUser
Agreed with repeated Storage.
message StartSandboxRequest {
    string hostname = 1;
    repeated string dns = 2;
+ repeated Storage
@laijs Yes, we need that one.
Why do we need it in both the Create and Start APIs?
We need it in StartSandbox for storage endpoints that are shared across all containers, and in CreateContainer for container-specific ones.
Here StartSandbox is actually a create-and-start operation. But I'm fine if we rename it to CreateSandbox(). It's also consistent with DestroySandbox().
I think "Sandbox" has some runtime semantics and is not persisted: it can be created as running, paused, or destroyed, and there should not be a created-but-stopped sandbox in our context.
@gnawux Right. So what I'm saying is that I'm fine if we only have:
rpc CreateSandbox(CreateSandboxRequest) returns (google.protobuf.Empty);
rpc DestroySandbox(DestroySandboxRequest) returns (google.protobuf.Empty);
because in our case we're only going to create and destroy sandboxes.
Are you suggesting we should have:
rpc CreateSandbox(CreateSandboxRequest) returns (google.protobuf.Empty);
rpc StartSandbox(StartSandboxRequest) returns (google.protobuf.Empty);
rpc StopSandbox(StopSandboxRequest) returns (google.protobuf.Empty);
rpc DestroySandbox(DestroySandboxRequest) returns (google.protobuf.Empty);
?
@sameo I think I did, but I found something different from the code you generated. Which version of protoc are you using? And did you generate the …
#
# SPDX-License-Identifier: Apache-2.0
#
protoc --proto_path=pkg/agent/api/grpc --go_out=plugins=grpc:pkg/agent/api/grpc pkg/agent/api/grpc/hyperstart.proto pkg/agent/api/grpc/oci.proto
I believe you also want to add -I=$GOPATH/src/github.com/google/protobuf/src/ to this command. Without it, protoc seems unable to find the Empty message:
$ hack/update-generated-agent-proto.sh
google/protobuf/empty.proto: File not found.
hyperstart.proto: Import "google/protobuf/empty.proto" was not found or had errors.
hyperstart.proto:15:62: "google.protobuf.Empty" is not defined.
hyperstart.proto:16:60: "google.protobuf.Empty" is not defined.
hyperstart.proto:17:54: "google.protobuf.Empty" is not defined.
hyperstart.proto:18:58: "google.protobuf.Empty" is not defined.
hyperstart.proto:25:52: "google.protobuf.Empty" is not defined.
hyperstart.proto:26:56: "google.protobuf.Empty" is not defined.
hyperstart.proto:29:56: "google.protobuf.Empty" is not defined.
hyperstart.proto:30:60: "google.protobuf.Empty" is not defined.
hyperstart.proto:31:62: "google.protobuf.Empty" is not defined.
hyperstart.proto:32:48: "google.protobuf.Empty" is not defined.
hyperstart.proto:33:56: "google.protobuf.Empty" is not defined.
Am I missing something?
➜ runtimes git:(grpc_proto) protoc --version
libprotoc 3.5.0
➜ runtimes git:(grpc_proto) hack/update-generated-agent-proto.sh
➜ runtimes git:(grpc_proto)
Works on my Mac.
@sameo Ah, I know why you got the Empty errors.
My protoc is built from source and installed together with the include files.
If you download the prebuilt protoc and only put the bin/protoc binary in your $PATH without installing the include/ dir, you will get the import error.
However, I have not generated the code with the getters in the diff yet.
@gnawux I installed protoc from my distro's protobuf packages (FC27, protobuf and protobuf-compiler):
$ rpm -q protobuf-compiler
protobuf-compiler-3.3.1-2.fc27.x86_64
@sameo Then does it have a -devel or -dev like package?
And we need to figure out why they generate different code.
Finally I solved the getter issue: I found an older protoc-gen-go in a higher-priority position in my $PATH.
@gnawux I'm using:
And when generating the pb.go files with the hack/ script, I get the following diff: https://gist.github.com/sameo/acb737c5370957afb96e5e7d905d1b24
@sameo I found the diff before I posted the PR as well, and I checked many things but didn't figure out what's missing...
I think it would be better to create a new repo for the API, such as agent-api, or move this API into the agent repo.
@gnawux Let's use the agent repo as it's the one that makes the most sense for now.
Then I will close this PR after updating it in the agent repo.
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and kata-containers/runtime#2
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and kata-containers/runtime#2 Signed-off-by: Wang Xu <gnawux@gmail.com>
Closed this PR and moved the protocol code to the kata-containers/agent repo.
Based on the hyperstart gRPC protocol and the discussion in hyperhq/runv#628.
A few people think it is better to put the protocol in an independent repo; what's your opinion?